Correlations in Networks associated to Preferential Growth 
Combinations of random and preferential growth are studied for both growing and stationary networks, 
and a hierarchical topology is observed. Thus, for real-world scale-free networks which 
do not exhibit hierarchical features, preferential growth is probably not the main ingredient in the 
growth process. An example of such a real-world network is the protein-protein interaction 
network in yeast, which exhibits pronounced anti-hierarchical features. 
One feature that many complex networks show is a 
scale-free degree distribution of vertices, that is, the 
probability of finding a vertex of degree $k$ follows 
$P(k) \propto k^{-\gamma}$. A popular explanation for the scale-free 
degree distribution is preferential attachment, 
in which new vertices tend to connect 
to already highly connected vertices. In addition to 
the degree distribution there are further topological 
measures that can be used to characterize networks, for 
example degree-degree correlations, that is, "who is 
connected to whom?". An understanding of the process by which 
networks emerge should therefore include an understanding 
of the corresponding topological measures, both for real 
networks and for network models. In particular, 
it has been observed that protein-protein networks 
have quite different degree-degree correlations than the 
Internet, although both molecular networks and the 
Internet show scale-free features. In the present paper 
we investigate versions of preferential attachment for both 
growing and stationary networks, and study the 
degree distribution and the degree-degree correlations. 
Our conclusion is that preferential attachment robustly 
produces a hierarchical type of degree-degree 
correlations. As a consequence, real networks which 
do not have this type of degree-degree correlations are 
unlikely to have evolved by a version of preferential 
attachment. 



A network, or more formally a graph, $G(V, E)$ consists 
of a set of vertices $V$ and a set of edges $E$ which 
connect pairs of vertices in the network. The pairs are 
ordered or unordered depending on whether the network is 
directed or not. We only consider undirected networks 
here. When generating such a network we consider four 
elementary processes: addition or removal of 
vertices or edges. Here we use preferential attachment 
when adding new vertices or edges to the graph, either 
on its own or combined with random 
attachment. We will furthermore consider both a 
growing network and a non-growing network that evolves 
by addition and removal of vertices and edges under 
steady-state conditions. 



I. GROWING NETWORKS 

First let us consider a network grown to some number 
of vertices $N$ that we fix from the beginning (typically 
we use $N = 10^3$). The network grows to this size by a 
process where at each step we do the following: 

• With probability $p$ a new vertex is added and connected 
with an edge to a preferentially selected vertex. 

• With probability $1 - p$ a new edge is added between 
two vertices which are 

— with probability $q$ both chosen preferentially. 

— with probability $1 - q$ chosen such that one vertex is 
selected preferentially and the other vertex randomly. 
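
The growth rules above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function name `grow_network`, the two-vertex seed graph and the "stub list" used for degree-proportional selection are assumptions made here, and $p > 0$ is assumed so that the loop terminates. Loops and double edges are rejected and the edge is re-attempted elsewhere.

```python
import random

def grow_network(n_vertices, p, q, seed=None):
    """Grow an undirected simple graph; returns a set of edges (u, v) with u < v."""
    rng = random.Random(seed)
    edges = {(0, 1)}   # assumed seed graph: two vertices joined by one edge
    stubs = [0, 1]     # every vertex appears once per incident edge
    n = 2              # current number of vertices

    def add_edge(u, v):
        edges.add((min(u, v), max(u, v)))
        stubs.extend((u, v))

    while n < n_vertices:
        if rng.random() < p:
            # New vertex attached to a preferentially selected vertex:
            # a uniform draw from the stub list is proportional to degree.
            add_edge(n, rng.choice(stubs))
            n += 1
        else:
            if len(edges) == n * (n - 1) // 2:
                continue   # graph is complete, no valid new edge exists yet
            both_preferential = rng.random() < q
            # Retry until the edge is neither a loop nor a double edge.
            while True:
                u = rng.choice(stubs)
                v = rng.choice(stubs) if both_preferential else rng.randrange(n)
                if u != v and (min(u, v), max(u, v)) not in edges:
                    add_edge(u, v)
                    break
    return edges
```

For example, `grow_network(1000, 0.4, 0.0, seed=1)` corresponds to the parameter choice $p = 0.4$, $q = 0$ with $N = 10^3$.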

Double edges and loops are not allowed, so 
each time an edge is added to the network a check is 
performed. If the connection is not valid, one attempts 
to put the edge somewhere else. This is always 
possible, except for some unimportant cases where the 
network is very small. To obtain good statistics, a number 
of networks are grown to the desired size $N$ by the rules 
above. For every network that is produced, one also generates 
a sample of randomized networks with exactly the same 
degree distribution as the grown network. 
We look at the likelihood $P(K_1, K_2)$ of having a connection 
between vertices of degree $K_1$ and vertices of 
degree $K_2$ in the studied network and compare it with the 
probability $P_r(K_1, K_2)$ of finding the same connection in 
the sample of randomized networks: 

$$R(K_1, K_2) = \frac{P(K_1, K_2)}{P_r(K_1, K_2)} \qquad (1)$$ 

The reason for comparing with a set of rewired networks 
is the inherently complicated nature of a network. 
So far, analytical approaches only apply to 
networks where multiple edges between two vertices and 
loops are allowed. With a specific degree distribution, and 
without loops or multiple edges between vertices, 
there is limited freedom when attaching edges. This 
restriction gives a preference to low-degree vertices 
connecting with high-degree vertices in a scale-free network. In 
order to measure how the correlations in the created 
network differ from those expected for a network with 
the same degree sequence, we divide the number of 
specific connections in the studied network by the number 
of such connections in the randomized networks. In 
principle one could also have obtained information about this 
"two-point" correlation from the measure of assortative 
mixing, compared with the set of randomized 
networks, but the full correlation profile contains more 
specific details about the topology. 
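
The comparison with randomized networks can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' code: the randomization is taken to be degree-preserving rewiring by pairwise edge swaps that reject loops and double edges, the helper names (`degrees`, `rewire`, `pair_counts`, `correlation_profile`) and the parameters `n_random` and `swaps_per_edge` are hypothetical, and degrees are left unbinned for simplicity. Because the studied and the randomized networks contain the same number of edges, the ratio of edge counts equals the ratio of connection probabilities.

```python
import random
from collections import Counter

def degrees(edges):
    """Degree of every vertex in an edge list."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg

def rewire(edges, n_swaps, rng):
    """Degree-preserving randomization: swap the ends of two random edges,
    rejecting swaps that would create a loop or a double edge.
    Needs at least two edges."""
    edge_list = list(edges)
    edge_set = {frozenset(e) for e in edge_list}
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edge_list)), 2)
        (a, b), (c, d) = edge_list[i], edge_list[j]
        if rng.random() < 0.5:
            c, d = d, c                      # pick one of the two possible swaps
        if a == d or c == b:
            continue                         # would create a loop
        new1, new2 = frozenset((a, d)), frozenset((c, b))
        if new1 in edge_set or new2 in edge_set:
            continue                         # would create a double edge
        edge_set -= {frozenset((a, b)), frozenset((c, d))}
        edge_set |= {new1, new2}
        edge_list[i], edge_list[j] = (a, d), (c, b)
    return edge_list

def pair_counts(edges, deg):
    """Number of edges joining a vertex of degree k1 to one of degree k2 (k1 <= k2)."""
    counts = Counter()
    for u, v in edges:
        counts[tuple(sorted((deg[u], deg[v])))] += 1
    return counts

def correlation_profile(edges, n_random=20, swaps_per_edge=10, seed=None):
    """Ratio of observed pair counts to their average over randomized networks."""
    rng = random.Random(seed)
    deg = degrees(edges)
    observed = pair_counts(edges, deg)
    expected = Counter()
    for _ in range(n_random):
        randomized = rewire(edges, swaps_per_edge * len(edges), rng)
        expected.update(pair_counts(randomized, deg))
    # Degree pairs never seen in the randomized sample are omitted.
    return {pair: observed[pair] * n_random / expected[pair]
            for pair in observed if expected[pair] > 0}
```

Values above one mark degree pairs that are connected more often than in the randomized sample, values below one mark suppressed connections.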




FIG. 1: Growing networks. Top: cumulative plots of the degree 
distributions $P(<k)$ for the networks. Bottom: the 
correlation profiles $R(K_1, K_2)$ for the respective networks. The 
number of vertices is 1000 for the correlation profiles and 
10000 for the degree distributions. The networks 
are generated with the parameters: (A) $p = 0.4$ and 
$q = 0$. (B) $p = 0.4$ and $q = 1$. (C) $p = 0.8$ and $q = 1$. 



In Fig. 1 we show the degree distributions of three 
differently grown networks labeled A, B and C. That is, 
we consider growth with different ratios of edge additions 
to vertex additions, as quantified by $p$. The given 
$p = 0.4$ corresponds to adding 4 vertices, each with one 
edge attached preferentially, for every 6 elementary 
edge additions. The cases $q = 1$ and $q = 0$ correspond to 
preferential attachment of these 6 edges at both ends, 
and to attachment of one of their ends to a 
randomly selected vertex, respectively. In all cases one obtains scale-free 
networks: 

$$P(k) \propto \frac{1}{k^{\gamma}} \qquad (2)$$ 

with an exponent $\gamma$ that decreases with increasing $q$ and with 
decreasing $p$. For the three cases in Fig. 1 we have that (A) $p = 0.4$ with 
$q = 0$ gives $\gamma = 2.86$, (B) $p = 0.4$ with $q = 1$ gives 
$\gamma = 2.14$ and (C) $p = 0.8$ with $q = 1$ gives $\gamma = 2.6$. 

Figure 1 (A-C) also examines the correlation profile. The 
overall pattern is that in all cases highly connected 
vertices tend to connect to highly connected vertices, a 
feature which has been associated with hierarchical 
topologies of networks. Another overall pattern is that the 
more edges there are, the more $R(K_1, K_2)$ approaches 
unity, and the hierarchical topologies thus tend to be 
suppressed by the overall noise. Examining the 
different types of growth, we furthermore see that the most 
hierarchical networks are obtained when edges are added 
randomly at one end and preferentially at the other end. 
In a somewhat similar vein, assortativity has been studied previously. 

Since analytical calculations are usually limited to 
networks where loops and multiple edges 
between vertices are allowed, we also investigate what effect 
this has on the correlation profile by performing the same 
growth process as before, but with the difference that 
multiple edges and loops are accepted. They 
are accepted both in the process of growing the 
networks and in the creation of the randomized networks. 
The resulting correlation profile is shown in Fig. 2. 




FIG. 2: Comparison of the correlation profile for (A) a network 
process with $p = 0.4$ and $q = 0$, and (B) the same process with the 
only difference that double edges and loops are allowed. 



In Fig. 2 we see that, even if double edges and 
loops are allowed, the highly connected vertices 
are indeed connected more frequently to each other compared 
with a maximal randomization. However, the peak is 
shifted towards higher degrees because many edges are 
allowed between two vertices. If still more edges per 
vertex are inserted, the differences will be even larger 
because of the number of double edges and loops that will 
be created. Preferential attachment is, however, not 
the full story behind the correlations. In 
the process of preferential attachment, the oldest vertices 
tend to become the vertices of highest degree. 
Furthermore, the insertion of an edge in the network connects 
two vertices created before the time of the edge insertion. 
This implies that when the network is complete and all 
vertex and edge insertions have been made, more edges have been put 
between older vertices than between younger vertices, simply 
because the network is smaller in the early stages of the 
growth process; thus older vertices have a higher 
probability of being connected by an edge than the younger 
vertices created in the later stages. This explains why, even 
if the edges are inserted randomly at both ends, one gets 
a highly hierarchical structure (Fig. 3A) compared to 
what is expected from the resulting degree distribution. 
The degree distribution no longer follows a power law but 
is still fairly broad. Comparing with the process where 
the excess edges are inserted after all the 
vertices have been inserted into the network (Fig. 3B), we observe that the 
hierarchical structure is no longer as apparent as in 
Fig. 3A. Furthermore, as a consequence of inserting fewer 
edges to the older (more connected) vertices, the degree 
distribution is not as broad as in Fig. 3A. 
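
The age effect described above can be checked with a small simulation. The following is a sketch under assumptions made here, not the authors' code: vertices are attached preferentially with probability `p`, extra edges are placed between two uniformly random existing vertices, the seed graph is a single edge, and the function names are hypothetical. Vertex labels equal the order of creation, so the first entries of the returned degree list belong to the oldest vertices.

```python
import random

def grow_random_edges(n_vertices, p, seed=None):
    """Preferential vertex insertion; extra edges random at both ends.
    Returns the list of degrees, indexed by order of creation."""
    rng = random.Random(seed)
    edges = {(0, 1)}   # assumed seed graph
    stubs = [0, 1]     # uniform draw from stubs = degree-proportional choice
    degree = [1, 1]
    while len(degree) < n_vertices:
        n = len(degree)
        if rng.random() < p:
            target = rng.choice(stubs)
            edges.add((target, n))
            stubs.extend((target, n))
            degree[target] += 1
            degree.append(1)
        elif len(edges) < n * (n - 1) // 2:
            # Draw two distinct random vertices until they are not yet linked.
            while True:
                u, v = sorted(rng.sample(range(n), 2))
                if (u, v) not in edges:
                    break
            edges.add((u, v))
            stubs.extend((u, v))
            degree[u] += 1
            degree[v] += 1
    return degree

def mean(values):
    return sum(values) / len(values)

degree = grow_random_edges(1000, 0.5, seed=0)
tenth = len(degree) // 10
print("mean degree, oldest tenth:  ", mean(degree[:tenth]))
print("mean degree, youngest tenth:", mean(degree[-tenth:]))
```

Comparing the two printed averages shows how strongly the oldest vertices accumulate edges in this process.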




FIG. 3: Cumulative plot of the degree distributions $P(<k)$ 
for two network processes generating a network of $N = 1000$ 
vertices. (A) A process of preferential vertex insertions and edge 
insertions, with $p = 0.5$. The edges are inserted randomly at 
both ends. (B) A process where the excess edges are inserted 
randomly after all the vertices have been inserted preferentially into 
the network. The number of edges is the same for the two 
networks, $M = 2000$. 
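
The edge count quoted in the caption is consistent with a simple counting argument, sketched here under the assumption that every growth step adds exactly one edge (rejected edges being re-attempted) and that the small seed graph can be neglected. The symbols $T$ (number of growth steps) and $\langle k \rangle$ (mean degree) are introduced for this illustration only.

```latex
% T growth steps add T edges and, on average, pT vertices.
N \approx pT, \qquad M \approx T
\quad\Longrightarrow\quad
M \approx \frac{N}{p}, \qquad
\langle k \rangle = \frac{2M}{N} \approx \frac{2}{p}.
% Fig. 3: N = 1000 and p = 0.5 give M \approx 2000.
% Fig. 1: p = 0.4 gives <k> \approx 5, and p = 0.8 gives <k> \approx 2.5.
```

A smaller $p$ therefore means a denser network, which is the sense in which "more edges" is used when comparing the panels of Fig. 1.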



II. STATIONARY NETWORKS 



Many real-world networks are not constantly growing, 
but may nevertheless be governed by a growth process, which 
should then be supplemented by a means of eliminating 
parts of the network. In the case of preferential growth 
the oldest vertices also become the most central ones, 
as shown in the degree-degree correlation profiles 
of the growing networks. It is therefore of interest to examine what happens 
if the oldest vertices may be randomly eliminated. This 
is investigated in the following steady-state model for 
growth and elimination in networks. 

Given that we want a network consisting of $N$ vertices, we 
grow the network as before, but in addition apply a removal 
step whenever the number of vertices exceeds $N$. The 
total algorithm then reads: 

• With probability $p$ a new vertex is added and connected 
with an edge to a preferentially selected vertex. 

• With probability $1 - p$ a new edge is added between 
two vertices which are 

— with probability $q$ both chosen preferentially. 

— with probability $1 - q$ chosen such that one vertex is 
selected preferentially and the other vertex randomly. 

• If the number of vertices exceeds $N$, remove a random vertex $n$ and 
all vertices that become isolated after the removal of $n$. 

At given time steps (typically of the order of the size of 
the network), randomizations of the network are made in 
order to calculate the degree-degree correlation profiles. 



FIG. 4: Stationary networks. Top: cumulative plots of the degree 
distributions $P(<k)$ for the networks. Bottom: the 
correlation profiles $R(K_1, K_2)$ for the respective networks. 
The number of vertices is 1000 for the correlation profiles 
and 10000 for the degree distributions. The networks 
are generated with the parameters: (A) $p = 0.4$ 
and $q = 0$. (B) $p = 0.4$ and $q = 1$. (C) $p = 0.8$ and $q = 1$. 



Figure 4 demonstrates that now, with both growth and 
elimination, the scale invariance is broken. This is a 
striking difference from the original Simon [13] "rich 
get richer" model. In his model money was assigned to people 
stochastically with a probability given by their present 
wealth, leading to a power-law distribution of wealth. 
In his case one also obtains a power-law distribution 
if one randomly eliminates agents independently of their 
wealth, see also [14]. The reason for the different behavior 
in the network case is that when one 
eliminates vertices with few connections, then with high 
probability one also reduces the number of connections 
of the vertices with high degrees. 
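
The steady-state procedure of this section (the growth steps plus random vertex removal once the size exceeds $N$) can be sketched as follows. This is a minimal sketch, not the authors' implementation: the names `preferential` and `steady_state_network`, the single-edge seed graph, the bound of 100 attempts when placing an edge, and the re-seeding of an emptied graph are all assumptions introduced for the illustration.

```python
import random

def preferential(adj, rng):
    """Pick a vertex with probability proportional to its degree."""
    nodes = list(adj)
    return rng.choices(nodes, weights=[len(adj[v]) for v in nodes])[0]

def steady_state_network(n_max, p, q, n_steps, seed=None):
    """Run n_steps of growth with elimination; returns adjacency sets."""
    rng = random.Random(seed)
    adj = {0: {1}, 1: {0}}   # assumed seed graph: a single edge
    next_label = 2
    for _ in range(n_steps):
        if not adj:
            # Assumption: if a removal emptied the graph, start again from a seed edge.
            a, b = next_label, next_label + 1
            adj = {a: {b}, b: {a}}
            next_label += 2
        if rng.random() < p:
            # New vertex attached to a preferentially selected vertex.
            target = preferential(adj, rng)
            adj[next_label] = {target}
            adj[target].add(next_label)
            next_label += 1
        else:
            both_preferential = rng.random() < q
            for attempt in range(100):   # bounded retries (assumption)
                u = preferential(adj, rng)
                v = preferential(adj, rng) if both_preferential else rng.choice(list(adj))
                if u != v and v not in adj[u]:   # no loops, no double edges
                    adj[u].add(v)
                    adj[v].add(u)
                    break
        if len(adj) > n_max:
            # Remove a random vertex and every neighbour left isolated by it.
            victim = rng.choice(list(adj))
            for w in adj.pop(victim):
                adj[w].discard(victim)
                if not adj[w]:
                    del adj[w]
    if not adj:
        adj = {next_label: {next_label + 1}, next_label + 1: {next_label}}
    return adj
```

Running the process for many more steps than $N$, for instance `steady_state_network(1000, 0.4, 0.0, 20000, seed=1)`, lets the network forget its initial growth phase before degree distributions or correlation profiles are measured.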

Considering the correlation profiles for the steady-state 
networks, one first of all notices that hierarchical 
features remain. Further, compared to the 
steadily growing networks, the hierarchical features are 
suppressed. Also notice that for high $p$ or low $q$ 
the degree distribution becomes close to exponential, a 
feature that in itself will diminish the importance of the 
edge degree as an informative characteristic of the vertex 
structure. However, the relative strength of the observed 
correlations for steady-state networks is qualitatively 
similar to what was obtained for the growing networks. 



In summary, we have shown that preferential attachment 
and continuous edge insertions lead to a rather 
robust, characteristic type of hierarchical degree-degree 
correlations. Thus, for real-world scale-free networks 
that do not exhibit hierarchical features, preferential 
growth is probably not the main ingredient in forming 
their topology. An example of such real-world networks 
is the protein-protein interaction network 
in yeast, which exhibits a pronounced anti-hierarchical 
topology. Thus the robustness of the hierarchical 
topology that preferential attachment gives rise to 
points to some difficulty in the preferential attachment 
scenario put forward in [15] for protein-protein networks, 
notwithstanding the fact that the older 
proteins were observed to be more connected than the 
younger ones. 